A Faster and Simpler Recursive Algorithm for the Lapack Routine Dgels
Abstract
We present new algorithms for computing the linear least squares solution to overdetermined linear systems and the minimum norm solution to underdetermined linear systems. For both problems, we consider the standard formulation min ‖AX − B‖_F and the transposed formulation min ‖A^T X − B‖_F, i.e., four different problems in all. The functionality of our implementation corresponds to that of the LAPACK routine DGELS. The new implementation is significantly faster and simpler, and it outperforms the LAPACK DGELS for all matrix sizes tested. The improvement is usually 50–100% and is as high as 400%. The four different problems of DGELS are essentially reduced to two by explicit transposition of A. Explicit transposition also avoids computing Householder transformations on vectors with large stride. The QR factorization of block columns of A is performed using a recursive level-3 algorithm. By interleaving updates of B with the factorization of A, we reduce the number of floating point operations performed for the linear least squares problem. By avoiding redundant computations in the update of B, we reduce the work needed to compute the minimum norm solution. Finally, we outline fully recursive algorithms for the four problems of DGELS as well as for QR factorization. AMS subject classification: 65F20, 65Y20.
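As a rough illustration of updating B while A is being factored, a minimal unblocked sketch in pure Python might look like the following. It is not the recursive level-3 algorithm of the paper: it assumes the overdetermined, non-transposed case with a full-rank A, works one column at a time, and uses the hypothetical function name `householder_lstsq`.

```python
import math


def householder_lstsq(A, B):
    """Solve min ||A X - B||_F for an m-by-n matrix A with m >= n.

    A and B are lists of rows and are copied, not modified.
    Illustrative sketch only: unblocked Householder QR, full rank assumed.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]   # overwritten by the triangular factor R
    C = [row[:] for row in B]   # overwritten by Q^T B
    nrhs = len(C[0])

    for k in range(n):
        # Build the Householder vector v that zeroes R[k+1:, k].
        x = [R[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            # Crude check: only catches an exactly zero column.
            raise ValueError("A appears to be rank deficient")
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)

        # Apply H = I - 2 v v^T / (v^T v) to the trailing columns of R
        # and, in the same step, to every column of C, so Q is never
        # formed and B is updated as the factorization proceeds.
        for M, cols in ((R, range(k, n)), (C, range(nrhs))):
            for j in cols:
                s = sum(v[i] * M[k + i][j] for i in range(len(v)))
                f = 2.0 * s / vtv
                for i in range(len(v)):
                    M[k + i][j] -= f * v[i]

    # Back substitution on the leading n-by-n triangle: R X = (Q^T B)[:n].
    X = [[0.0] * nrhs for _ in range(n)]
    for j in range(nrhs):
        for i in range(n - 1, -1, -1):
            s = C[i][j] - sum(R[i][p] * X[p][j] for p in range(i + 1, n))
            X[i][j] = s / R[i][i]
    return X
```

For example, calling `householder_lstsq([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [[1.0], [2.0], [2.0]])` fits a straight line to the points (0, 1), (1, 2), (2, 2) and returns an intercept of approximately 7/6 and a slope of approximately 1/2. Under the same assumptions, the minimum norm case could be handled by factoring the explicitly transposed matrix, which is the kind of reduction from four problems to two that the abstract describes.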
Related articles
A New Much Faster and Simpler Algorithm for Lapack Dgels
We present new algorithms for computing the linear least squares solution to overdetermined linear systems and the minimum norm solution to underdetermined linear systems. For both problems, we consider the standard formulation min ‖AX − B‖_F and the transposed formulation min ‖A^T X − B‖_F, i.e., four different problems in all. The functionality of our implementation corresponds to that of the L...
A high-performance algorithm for the linear least squares problem on SMP systems
We present new recursive serial and parallel algorithms for the linear least squares problem AX = B, where A is m by n, m ≥ n. The algorithms improve performance. This work is an extension of our work on QR factorization [4]. The key idea is to combine the computation of Q^T B with the QR factorization, thereby saving computations compared to the standard LAPACK algorithm. Recursion allows us to r...
Numerical Algorithms for Linear and Nonlinear Algebra: Experience with a Recursive Perturbation
Recursive algorithms for symmetric indefinite linear systems are considered in the present paper. First, the difficulties with the recursive formulation of the LAPACK SYSV algorithm (which implements the Bunch-Kaufman pivoting strategy) are discussed. Then a recursive perturbation based algorithm is proposed and tested. The experiments show that the new algorithm can be about two times faster alt...
Algorithm xxx: an Efficient Algorithm for Solving Rank-Deficient Least Squares Problems
Existing routines, such as xGELSY or xGELSD in LAPACK, for solving rank-deficient least squares problems require O(mn²) operations to solve min ‖b − Ax‖, where A is an m by n matrix. We present a modification of the LAPACK routine xGELSY that requires O(mnk) operations, where k is the effective numerical rank of the matrix A. For low rank matrices the modification is an order of magnitude faster tha...
High Performance Relevance Vector Machine on GPUs
The Relevance Vector Machine (RVM) algorithm has been widely utilized in many applications, such as machine learning, image pattern recognition, and compressed sensing. However, the RVM algorithm is computationally expensive. We seek to accelerate the RVM algorithm computation for time sensitive applications by utilizing massively parallel accelerators such as GPUs. In this paper, the computati...
Publication date: 2001